hydrology and earth system science
Leveraging Exogenous Signals for Hydrology Time Series Forecasting
He, Junyang, Fox, Judy, Jafari, Alireza, Chen, Ying-Jung, Fox, Geoffrey
Recent advances in time series research facilitate the development of foundation models. While many state-of-the-art time series foundation models have been introduced, few studies examine their effectiveness in specific downstream applications in physical science. This work investigates the role of integrating domain knowledge into time series models for hydrological rainfall-runoff modeling. Using the CAMELS-US dataset, which includes rainfall and runoff data from 671 locations with six time series streams and 30 static features, we compare baseline and foundation models. Results demonstrate that models incorporating comprehensive known exogenous inputs outperform more limited approaches, including foundation models. Notably, incorporating natural annual periodic time series contributes the most significant improvements.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.05)
- North America > United States > Georgia > Fulton County > Atlanta (0.05)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
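The abstract above credits "natural annual periodic time series" as the most useful known exogenous input. One common way to build such a signal is a sine/cosine encoding of the day of year. The sketch below assumes that encoding; the abstract does not say how the authors constructed their periodic inputs, and the function name `annual_cycle_features` is hypothetical.

```python
import calendar
import math
from datetime import date
from typing import Tuple


def annual_cycle_features(day: date) -> Tuple[float, float]:
    """Encode the position of `day` within its year as a (sin, cos) pair.

    Assumption: a sin/cos day-of-year encoding is used as the annual
    periodic signal; this is an illustration, not the paper's method.
    """
    # Day of year runs from 1 to 365 (366 in leap years).
    day_of_year = day.timetuple().tm_yday
    days_in_year = 366 if calendar.isleap(day.year) else 365
    # Map 1 January to angle 0 and the year's end to just under 2*pi.
    angle = 2.0 * math.pi * (day_of_year - 1) / days_in_year
    return math.sin(angle), math.cos(angle)


if __name__ == "__main__":
    for d in (date(2001, 1, 1), date(2001, 4, 1), date(2001, 7, 1)):
        s, c = annual_cycle_features(d)
        print(d.isoformat(), round(s, 3), round(c, 3))
```

Because the pair is known for any future date, it can be supplied to a forecasting model over the whole prediction horizon, which is what makes it a "known" exogenous input.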
Efficacy of Temporal Fusion Transformers for Runoff Simulation
Koya, Sinan Rasiya, Roy, Tirthankar
Combining attention with recurrence has been shown to be valuable in sequence modeling, including hydrological predictions. Here, we explore the strength of Temporal Fusion Transformers (TFTs) over Long Short-Term Memory (LSTM) networks in rainfall-runoff modeling. We train ten randomly initialized models, TFT and LSTM, for 531 CAMELS catchments in the US. We repeat the experiment with five subsets of the Caravan dataset, each representing catchments in the US, Australia, Brazil, Great Britain, and Chile. Then, the performance of the models, their variability regarding the catchment attributes, and the difference according to the datasets are assessed. Our findings show that TFT slightly outperforms LSTM, especially in simulating the midsection and peak of hydrographs. Furthermore, we show the ability of TFT to handle longer sequences and why it can be a better candidate for higher or larger catchments. Being an explainable AI technique, TFT identifies the key dynamic and static variables, providing valuable scientific insights. However, both TFT and LSTM exhibit a considerable drop in performance with the Caravan dataset, indicating possible data quality issues. Overall, the study highlights the potential of TFT in improving hydrological modeling and understanding.
- South America > Chile (0.25)
- South America > Brazil (0.25)
- Oceania > Australia (0.25)
- (3 more...)
Fine Flood Forecasts: Incorporating local data into global models through fine-tuning
Floods are the most common form of natural disaster and accurate flood forecasting is essential for early warning systems. Previous work has shown that machine learning (ML) models are a promising way to improve flood predictions when trained on large, geographically-diverse datasets. This requirement of global training can result in a loss of ownership for national forecasters who cannot easily adapt the models to improve performance in their region, preventing ML models from being operationally deployed. Furthermore, traditional hydrology research with physics-based models suggests that local data -- which in many cases is only accessible to local agencies -- is valuable for improving model performance. To address these concerns, we demonstrate a methodology of pre-training a model on a large, global dataset and then fine-tuning that model on data from individual basins. This results in performance increases, validating our hypothesis that there is extra information to be captured in local data. In particular, we show that performance increases are most significant in watersheds that underperform during global training. We provide a roadmap for national forecasters who wish to take ownership of global models using their own data, aiming to lower the barrier to operational deployment of ML-based hydrological forecast systems.
- South America > Brazil (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Update hydrological states or meteorological forcings? Comparing data assimilation methods for differentiable hydrologic models
Jamaat, Amirmoez, Song, Yalan, Rahmani, Farshid, Liu, Jiangtao, Lawson, Kathryn, Shen, Chaopeng
Data assimilation (DA) enables hydrologic models to update their internal states using near-real-time observations for more accurate forecasts. With deep neural networks like long short-term memory (LSTM), using either lagged observations as inputs (called "data integration") or variational DA has shown success in improving forecasts. However, it is unclear which methods are performant or optimal for physics-informed machine learning ("differentiable") models, which represent only a small number of physically-meaningful states while using deep networks to supply parameters or missing processes. Here we developed variational DA methods for differentiable models, including optimizing adjusters for just precipitation data, just model internal hydrological states, or both. Our results demonstrated that differentiable streamflow models using the CAMELS dataset can benefit from variational DA as strongly as LSTM, with one-day lead time median Nash-Sutcliffe efficiency (NSE) elevated from 0.75 to 0.82. The resulting forecast matched or outperformed LSTM with DA in the eastern, northwestern, and central Great Plains regions of the conterminous United States. Both precipitation and state adjusters were needed to achieve these results, with the latter being substantially more effective on its own, and the former adding moderate benefits for high flows. Our DA framework does not need systematic training data and could serve as a practical DA scheme for whole river networks.
- North America > United States > California (0.04)
- North America > United States > Pennsylvania > Centre County > University Park (0.04)
- North America > United States > Alabama (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
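The data assimilation abstract above reports its gains as Nash-Sutcliffe efficiency (NSE). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². The function name `nse` and the plain-list interface are choices made for this illustration, not taken from the paper.

```python
from typing import Sequence


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency of a simulated series against observations.

    1.0 is a perfect match; 0.0 means the simulation is no better than
    always predicting the observed mean; negative values are worse than that.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the simulation.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Squared deviation of the observations around their own mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - residual / variance


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0]
    print(nse(obs, obs))              # perfect simulation
    print(nse(obs, [2.0, 2.0, 2.0]))  # simulation equal to the observed mean
```

Benchmark studies usually compute NSE per catchment and report the median across catchments, which is the form of the 0.75 and 0.82 figures quoted in the abstract.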
A Deep State Space Model for Rainfall-Runoff Simulations
Wang, Yihan, Zhang, Lujun, Yu, Annan, Erichson, N. Benjamin, Yang, Tiantian
The rainfall-runoff relationship is a fundamental concept in hydrology. It describes how rainfall is transformed into surface runoff through interconnected hydrologic processes, such as infiltration, evapotranspiration, and the exchange of water between surface and subsurface flows (Beven & Kirkby, 1979). Thoroughly understanding these hydrologic processes and subsequently achieving accurate simulations of the rainfall-runoff relationship are critical for proactive flood forecasting and mitigation, efficient agricultural planning, and strategic urban development (Beven, 2012; Knapp et al., 1991; Moradkhani & Sorooshian, 2008). Physically-based hydrologic models (PBMs), grounded in physical laws that govern hydrologic dynamics, are the standard tools for simulating rainfall-runoff relationships (Beven, 1996). However, the highly nonlinear nature of various hydrologic processes often challenges PBMs, limiting their accuracy in diverse conditions (Beven, 1989; Clark et al., 2017). Consequently, there is a growing need for innovative approaches to address the limitations of PBMs. Deep learning (DL) has emerged as an alternative to PBMs, showing success in capturing the complex, nonlinear patterns in rainfall-runoff simulations. The hydrology community also explores the applicability of DL models in rainfall-runoff simulations across diverse temporal scales and geospatial locations.
- North America > United States > Virginia (0.04)
- North America > United States > Minnesota (0.04)
- North America > United States > West Virginia (0.04)
- (16 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Energy (0.68)
- Government > Military (0.68)
Using Machine Learning to Discover Parsimonious and Physically-Interpretable Representations of Catchment-Scale Rainfall-Runoff Dynamics
Wang, Yuan-Heng, Gupta, Hoshin V.
Despite the excellent real-world predictive performance of modern machine learning (ML) methods, many scientists remain hesitant to discard traditional physical-conceptual (PC) approaches due mainly to their relative interpretability, which contributes to credibility during decision-making. In this context, a currently underexplored aspect of ML is how to develop "minimally-optimal" representations that can facilitate better "insight regarding system functioning". Regardless of how this is achieved, it is arguably true that parsimonious representations better support the advancement of scientific understanding. Our own view is that ML-based modeling of geoscientific systems should be based on the use of computational units that are fundamentally interpretable by design. This paper continues our exploration of how the strengths of ML can be exploited in the service of better understanding via scientific investigation. Here, we use the Mass Conserving Perceptron (MCP) as the fundamental computational unit in a generic network architecture consisting of nodes arranged in series and parallel to explore several generic and important issues related to the use of observational data for constructing input-state-output models of dynamical systems. In the context of lumped catchment modeling, we show that physical interpretability and excellent predictive performance can both be achieved using a relatively parsimonious "distributed-state" multiple-flowpath network with context-dependent gating and "information sharing" across the nodes, suggesting that MCP-based modeling can play a significant role in application of ML to geoscientific investigation.
- Leisure & Entertainment (0.93)
- Energy > Oil & Gas (0.47)
A Mass-Conserving-Perceptron for Machine Learning-Based Modeling of Geoscientific Systems
Wang, Yuan-Heng, Gupta, Hoshin V.
Although decades of effort have been devoted to building Physical-Conceptual (PC) models for predicting the time-series evolution of geoscientific systems, recent work shows that Machine Learning (ML) based Gated Recurrent Neural Network technology can be used to develop models that are much more accurate. However, the difficulty of extracting physical understanding from ML-based models complicates their utility for enhancing scientific knowledge regarding system structure and function. Here, we propose a physically-interpretable Mass Conserving Perceptron (MCP) as a way to bridge the gap between PC-based and ML-based modeling approaches. The MCP exploits the inherent isomorphism between the directed graph structures underlying both PC models and GRNNs to explicitly represent the mass-conserving nature of physical processes while enabling the functional nature of such processes to be directly learned (in an interpretable manner) from available data using off-the-shelf ML technology. As a proof of concept, we investigate the functional expressivity (capacity) of the MCP, explore its ability to parsimoniously represent the rainfall-runoff (RR) dynamics of the Leaf River Basin, and demonstrate its utility for scientific hypothesis testing. To conclude, we discuss extensions of the concept to enable ML-based physical-conceptual representation of the coupled nature of mass-energy-information flows through geoscientific systems.
- North America > United States > Arizona > Pima County > Tucson (0.14)
- North America > United States > Mississippi (0.04)
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- (11 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Energy (1.00)
- Water & Waste Management > Water Management (0.67)
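To make the idea of a mass-conserving recurrent unit concrete, the following toy storage cell splits the stored water into outflow, loss, and retained fractions that always sum to one, so no mass is created or destroyed. It is an illustrative sketch only, not the authors' MCP formulation: the softmax gating, the fixed logits, and the name `mass_conserving_step` are all assumptions made here.

```python
import math
from typing import Tuple


def mass_conserving_step(
    storage: float, inflow: float, out_logit: float, loss_logit: float
) -> Tuple[float, float, float]:
    """Advance a toy storage cell by one time step.

    Returns (new_storage, outflow, loss). The two logits stand in for
    learnable gate parameters (an assumption of this sketch).
    """
    # Softmax over three fates of stored water: leave as outflow, leave as
    # loss (e.g. evaporation), or remain. The retain option has logit 0.
    weights = [math.exp(out_logit), math.exp(loss_logit), 1.0]
    total = sum(weights)
    frac_out, frac_loss, frac_keep = (w / total for w in weights)

    outflow = frac_out * storage
    loss = frac_loss * storage
    # Whatever is retained, plus this step's inflow, becomes the new state.
    new_storage = frac_keep * storage + inflow
    return new_storage, outflow, loss


if __name__ == "__main__":
    state = 9.0
    for rain in (3.0, 0.0, 1.5):
        state, q, e = mass_conserving_step(state, rain, 0.0, 0.0)
        print(round(state, 3), round(q, 3), round(e, 3))
```

At every step, `new_storage + outflow + loss` equals `storage + inflow` up to floating-point rounding, which is the conservation property the abstract refers to. In a learned model the logits would depend on the current state and inputs instead of being constants.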
Probing the limit of hydrologic predictability with the Transformer network
Liu, Jiangtao, Bian, Yuchen, Shen, Chaopeng
For a number of years since their introduction to hydrology, recurrent neural networks like long short-term memory (LSTM) have proven remarkably difficult to surpass in terms of daily hydrograph metrics on known, comparable benchmarks. Outside of hydrology, Transformers have now become the model of choice for sequential prediction tasks, making the Transformer a curious architecture to investigate. Here, we first show that a vanilla Transformer architecture is not competitive against LSTM on the widely benchmarked CAMELS dataset, and lags especially on the high-flow metrics due to short-term processes. However, a recurrence-free variant of Transformer can obtain mixed comparisons with LSTM, producing the same Kling-Gupta efficiency coefficient (KGE), along with other metrics. The lack of advantages for the Transformer is linked to the Markovian nature of the hydrologic prediction problem. Similar to LSTM, the Transformer can also merge multiple forcing datasets to improve model performance. While the Transformer results are not higher than current state-of-the-art, we still learned some valuable lessons: (1) the vanilla Transformer architecture is not suitable for hydrologic modeling; (2) the proposed recurrence-free modification can improve Transformer performance so future work can continue to test more of such modifications; and (3) the prediction limits on the dataset should be close to the current state-of-the-art model. As a non-recurrent model, the Transformer may bear scale advantages for learning from bigger datasets and storing knowledge. This work serves as a reference point for future modifications of the model.
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (4 more...)
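The Transformer abstract above compares models on the Kling-Gupta efficiency (KGE). A minimal sketch of the original 2009 formulation follows: KGE = 1 - sqrt((r - 1)² + (α - 1)² + (β - 1)²), where r is the linear correlation, α the ratio of standard deviations, and β the ratio of means (simulated over observed). Whether the paper uses this variant or a later modified one is not stated in the abstract, so treat the choice as an assumption; the function name `kge` is likewise illustrative.

```python
import math
from typing import Sequence


def kge(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Kling-Gupta efficiency (2009 form); 1.0 indicates a perfect match."""
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    # Population standard deviations and covariance.
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in simulated) / n)
    if std_obs == 0 or std_sim == 0 or mean_obs == 0:
        raise ValueError("KGE is undefined for constant or zero-mean series")
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated)) / n

    r = cov / (std_obs * std_sim)   # timing and shape agreement
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    print(kge(obs, obs))
    print(kge(obs, [2.0 * o for o in obs]))
```

Unlike NSE, KGE separates correlation, variability, and bias into distinct terms, so a simulation that doubles every flow is penalized through α and β even though its correlation with the observations is perfect.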
Accurate Hydrologic Modeling Using Less Information
Shalev, Guy, El-Yaniv, Ran, Klotz, Daniel, Kratzert, Frederik, Metzger, Asher, Nevo, Sella
Joint models are a common and important tool in the intersection of machine learning and the physical sciences, particularly in contexts where real-world measurements are scarce. Recent developments in rainfall-runoff modeling, one of the prime challenges in hydrology, show the value of a joint model with shared representation in this important context. However, current state-of-the-art models depend on detailed and reliable attributes characterizing each site to help the model differentiate correctly between the behavior of different sites. This dependency can present a challenge in data-poor regions. In this paper, we show that we can replace the need for such location-specific attributes with a completely data-driven learned embedding, and match previous state-of-the-art results with less information.
- North America > United States (0.47)
- South America > Chile (0.05)
- Oceania > Australia (0.04)
- (6 more...)